






INVESTOR IN PEOPLE 



0 9 JUL 2004 



The Patent Office 



PRIORITY 
DOCUMENT 



Concept House 
Cardiff Road 
Newport 
South Wales 



SUBMITTED OR TRANSMITTED IN 
COMPLIANCE WITH RULE 17.1(a) OR (b)






REC'D 21 FEB 2003



PCT 



I, the undersigned, being an officer duly authorised in accordance with Section 74(1) and (4)
of the Deregulation & Contracting Out Act 1994, to sign and issue certificates on behalf of the 
Comptroller-General, hereby certify that annexed hereto is a true copy of the documents as 
originally filed in connection with the patent application identified therein. 



In accordance with the Patents (Companies Re-registration) Rules 1982, if a company named 
in this certificate and any accompanying documents has re-registered under the Companies Act 
1980 with the same name as that with which it was registered immediately before re- 
registration save for the substitution as, or inclusion as, the last part of the name of the words 
"public limited company" or their equivalents in Welsh, references to the name of the company 
in this certificate and any accompanying documents shall be treated as references to the name 
with which it is so re-registered. 



In accordance with the rules, the words "public limited company" may be replaced by p.l.c.,
plc, P.L.C. or PLC.



Re-registration under the Companies Act does not constitute a new legal entity but merely 
subjects the company to certain additional company law rules.




An Executive Agency of the Department of Trade and Industry 



Dated 



Signed 







Patents Form 1/77



Patents Act 1977 
(Rule 16)



THE PATENT OFFICE

10 JAN 2002

RULE 97 





(See the notes on the back of this form. You can also get
an explanatory leaflet from the Patent Office to help
you fill in this form.)



Fee: £0 



The Patent Office 

Cardiff Road 
Newport 

Gwent NP9 1RH



1. Your reference 



43952/JTyiD/MAR 



2. Patent application number



0200689.8 



10 JAN 2002 






3. Full name, address and postcode of the or of
each applicant (underline all surnames)



University of Newcastle Upon Tyne 
6 Kensington Terrace 
Newcastle Upon Tyne 
NE1 7RU



Patents ADP number (if you know it) 

If the applicant is a corporate body, give the 
country/state of incorporation



UNITED KINGDOM 




4. Title of the invention 



Fusion Proteins 



5. Full name, address and postcode in the United
Kingdom to which all correspondence relating
to this form and translation should be sent



Reddie & Grose 
16 Theobalds Road 
LONDON 
WC1X 8PL



Patents ADP number (if you know it) 



91001 



6. If you are declaring priority from one or more
earlier patent applications, give the country
and the date of filing of the or of each of these
earlier applications and (if you know it) the or
each application number



Country 



Priority application 
(if you know it)



Date of filing 
(day/month/year) 



7. If this application is divided or otherwise
derived from an earlier UK application,
give the number and the filing date of
the earlier application

Number of earlier application        Date of filing (day/month/year)

8. Is a statement of inventorship and of right 
to grant of a patent required in support of 
this request? (Answer 'Yes' if: 

a) any applicant named in part 3 is not an inventor, or

b) there is an inventor who is not named as an
applicant, or

c) any named applicant is a corporate body. 
See note (d)) 



Patents Form 1/77 






9. Enter the number of sheets for any of the
following items you are filing with this form.



Continuation sheets of this form

Description: 19

Claim(s): 5

Abstract: 0

Drawing(s): 6




10. If you are also filing any of the following, 
state how many against each item.



Priority documents 

Translations of priority documents 

Statement of inventorship and right
to grant of a patent (Patents Form 7/77) 

Request for preliminary examination 
and search (Patents Form 9/77) 

Request for substantive examination 
(Patents Form 10/77) 

Any other documents 
(please specify) 



0 
0 



11. 


I/We request the grant of a patent on the basis of this application. 

Signature Date 
9 January 2002


12. Name and daytime telephone number of 
person to contact in the United Kingdom 


S J N GOODMAN 
01223 360350 



After an application for a patent has been filed, the Comptroller of the Patent Office will consider whether publication or communication of
the invention should be prohibited or restricted under Section 22 of the Patents Act 1977. You will be informed if it is necessary to prohibit
or restrict your invention in this way. Furthermore, if you live in the United Kingdom, Section 23 of the Patents Act 1977 stops you from
applying for a patent abroad without first getting written permission from the Patent Office unless an application has been filed at least 6
weeks beforehand in the United Kingdom for a patent for the same invention and either no direction prohibiting publication or
communication has been given, or any such direction has been revoked.

Notes 

a) If you need help to fill in this form or you have any questions, please contact the Patent Office on 0645 500505.

b) Write your answers in capital letters using black ink or you may type them. 

c) If there is not enough space for all the relevant details on any part of this form, please continue on a separate sheet of 
paper and write "see continuation sheet" in the relevant part(s). Any continuation sheet should be attached to this form. 

d) If you have answered 'Yes' Patents Form 7/77 will need to be filed.

e) Once you have filled in the form you must remember to sign and date it. 

f) For details of the fee and ways to pay please contact the Patent Office.



DUPLICATE 



FUSION PROTEINS 



The present invention relates to fusion proteins (fusion polypeptides), particularly for use in 
expression and/or purification systems. 

Purified proteins are required for several applications. However, the isolation of pure proteins, 
in sufficient quantities, is sometimes problematic. For protein function studies, large amounts 
of a protein of interest (for example, a mutated protein) are often needed. Various expression 
systems have been used for heterologous production of proteins. Escherichia coli (E. coli) is 
still the most common host despite huge advances in the area of protein expression in the last 
ten years in other hosts. E. coli is popular because expressing proteins in the bacterium is
relatively simple, because a vast amount of knowledge about the bacterium itself exists, and
(sometimes most importantly) because of the low costs associated with production.



Proteins can be expressed in E. coli either directly or as fusions (of a "fusion partner" and a
protein or polypeptide), also known as fusion proteins. The purpose of fusion partners is to 
provide affinity tags (e.g. Hisn tag, glutathione-S-transferase, cellulose binding domain, intein 
tags), to make proteins more soluble (e.g. glutathione-S-transferase), to enable formation of 
disulphide bonds (e.g. thioredoxin), or to export fused proteins to the periplasm where 
conditions for the formation of disulphide bonds are more favourable (e.g. DsbA and DsbC). 
Proteins used as fusion partners are normally small (less than 30 kDa). 

TolA is a periplasmic protein involved in (1) maintaining the integrity of the outer membrane
and (2) the uptake of colicins and bacteriophages. The first function is evidenced by the 
increased outer membrane instability (e.g. SDS sensitivity) of TolA- mutants. The second
function is based upon the use of TolA as a receptor by phage proteins (Lubkowski, J. et al.,
1999, Structure With Folding & Design 7: 711-722) and colicins (Gokce, I. et al., 2000, J.
Mol. Biol. 304: 621-632). This has been revealed both by the phage/colicin resistance of tolA
mutants and by direct demonstration of the TolA-protein interactions by physical methods.
TolA is composed of three domains. A short N-terminal domain is composed of a single
transmembrane helix, which anchors TolA in the inner membrane. The second, largest domain
is polar and mainly α-helical. A C-terminal domain III (TolAIII) is small and composed of 92
amino acids. Its 3D structure was recently solved in a complex with the N1 domain of minor coat
gene 3 protein of Ff filamentous bacteriophage (Holliger, P. et al., 1999, J. Mol. Biol. 288:
649-657). It is tightly folded into a slightly elongated protein with the aid of one disulphide
bond (Figure 1). Various homologues of the TolA protein are known, for example from E. coli
(SwissProt Acc. No. P19934), Salmonella species (for example Genbank Acc. Nos
gi16764117 and gi1675986), Pectobacterium species (for example Genbank Acc. No.
gi16116636) and Haemophilus species (for example Genbank Acc. No. gi2126342).

The present inventors have found that the TolAIII domain has remarkable properties which are
of particular use as a fusion protein partner to achieve high levels of expression in a host cell.

According to the present invention there is provided a fusion polypeptide for expression in a
host cell comprising a TolAIII domain or a functional homologue, fragment, or derivative
thereof and a non-TolA polypeptide excluding a polypeptide consisting of or comprising an
amino acid sequence corresponding to residues 1-86 of mature phage minor coat gene 3
protein g3p.

Lubkowski et al. (1999; supra) disclose a fusion protein comprising residues 1-86 (the N1
domain) of the filamentous Ff bacteriophage minor coat gene 3 protein g3p towards the
N-terminus and residues 295-425 (which include the TolAIII domain) of TolA, a coreceptor of
g3p, towards the C-terminus, and a C-terminal Ala3His6 tail. The fusion protein was used by
Lubkowski et al. to elucidate the crystal structure of a complex formed between the g3p N1
and TolAIII domains.

Further provided according to the present invention is a fusion polypeptide for expression in a
host cell comprising a TolAIII domain or a functional homologue, fragment, or derivative
thereof and a non-TolA polypeptide, wherein the TolAIII domain or functional homologue,
fragment, or derivative thereof is located towards the N-terminus of the fusion polypeptide
and the non-TolA polypeptide is located towards the C-terminus of the fusion polypeptide.
Alternatively, the TolAIII domain or functional homologue, fragment, or derivative thereof
may be located towards the C-terminus of the fusion polypeptide and the non-TolA
polypeptide excluding a polypeptide consisting of an amino acid sequence corresponding to
residues 1-86 of mature phage minor coat gene 3 protein g3p may be located towards the
N-terminus of the fusion polypeptide.

As used herein, the terms "polypeptide" and "protein" are synonymous and refer to a sequence
of two or more linked amino acid residues. 

The TolAIII domain has been shown by the present inventors to facilitate higher than expected
levels of TolAIII fusion polypeptide expression in a host cell. The TolAIII domain fusions will
be useful, for example, for obtaining purified protein and polypeptide partners and/or for
studying the properties of these partners.

The TolAIII domain or functional homologue, fragment, or derivative thereof may be
codon-optimised for expression in the host cell.

The fusion polypeptide may further comprise a linker between the TolAIII domain or
functional homologue, fragment, or derivative thereof and the non-TolA polypeptide. The
linker may provide a physical separation between the TolAIII domain or functional
homologue, fragment, or derivative thereof and the non-TolA polypeptide or may be
functional. The linker may comprise at least one cleavage site for an endopeptidase. For
example, the cleavage site may comprise the amino acid sequence DDDDK and/or LVPR
and/or IEGR.
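As an illustration of how such a linker behaves, the following minimal Python sketch locates a recognition sequence in a fusion sequence and splits the chain immediately after it. The protease assignments follow the vector descriptions given later in this document (pTolE, pTolT, pTolX). The partner placeholder and the target sequence are invented for the example and are not taken from the application.

```python
# Recognition sequences named in the text, keyed by the endopeptidase that
# the later vector descriptions associate with each of them.
CLEAVAGE_SITES = {
    "enterokinase": "DDDDK",
    "thrombin": "LVPR",
    "factor Xa": "IEGR",
}


def cleave(fusion: str, protease: str) -> tuple[str, str]:
    """Split a fusion sequence immediately after the first recognition site."""
    site = CLEAVAGE_SITES[protease]
    pos = fusion.find(site)
    if pos == -1:
        raise ValueError(f"no {protease} site ({site}) in sequence")
    cut = pos + len(site)
    return fusion[:cut], fusion[cut:]


# Hypothetical fusion: placeholder partner + flexible spacer + site + target.
PARTNER = "X" * 10       # stand-in for the TolAIII domain, NOT its real sequence
TARGET = "GSMKTAYIAK"    # made-up target polypeptide
fusion = PARTNER + "GGGS" + "LVPR" + TARGET

tag_part, released = cleave(fusion, "thrombin")
print(tag_part)
print(released)
```

The sketch only models sequence bookkeeping; it says nothing about cleavage efficiency or site accessibility in a folded protein.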

In one embodiment, the fusion polypeptide according to the invention may further comprise an
affinity purification tag. The affinity purification tag may be located at the N-terminus of the
fusion polypeptide. For example, the affinity purification tag is an N-terminal Hisn tag, with
n=4, 5, 6, 7, 8, 9 or 10 (preferably 6), optionally with the Hisn tag linked to the fusion
polypeptide by one or more Ser residues (preferably 2). The affinity purification tag will
provide one means for immobilising the fusion polypeptide, for example as a step in
purification.

Preferably, the TolAIII domain consists of amino acid residues 329-421 of the Escherichia coli
TolA sequence (SwissProt Acc. No. P19934).

The host cell may be bacterial (for example, Escherichia coli). 

Further provided according to the present invention is a DNA molecule encoding the fusion 
polypeptide as defined above. The mRNA properties of the DNA molecule when transcribed 
may be optimised for expression in the host cell. 

Also provided is an expression vector comprising the DNA molecule as defined above for 
expression of the fusion polypeptide of the invention. The expression vector may have an 
inducible promoter (for example, the IPTG-inducible T7 promoter) which drives expression 
of the fusion polypeptide. The expression vector may also have an antibiotic resistance marker 
(for example, the bla gene, which confers resistance to ampicillin).

In another aspect of the invention there is provided a cloning vector for producing the
expression vector as defined above, comprising DNA encoding the TolAIII domain or a
functional homologue, fragment, or derivative thereof upstream or downstream from a cloning
site which allows in-frame insertion of DNA encoding a non-TolA polypeptide. The cloning
vector may further comprise DNA encoding at least one cleavage site (for example, the amino
acid sequence DDDDK and/or LVPR and/or IEGR) for an endopeptidase, the cleavage site
located between the DNA encoding the TolAIII domain or a functional homologue, fragment,
or derivative thereof and the cloning site. The cloning site may comprise at least one
restriction endonuclease (for example, BamHI and/or KpnI) target sequence. The cloning
vector may further comprise DNA encoding an affinity purification tag as defined above. The 
cloning vector may further comprise an inducible promoter (for example, the IPTG-inducible 
T7 promoter) and/or DNA encoding an antibiotic resistance marker (for example, the bla 
gene, which confers resistance to ampicillin).

For example, the cloning vector may have the structure of pTolE, pTolT or pTolX (as shown in
Figure 2 with reference to the description).

Also provided is the use of the TolAIII domain or functional homologue, fragment, or
derivative thereof for production of a fusion polypeptide as defined above.

Further provided is the use of the TolAIII domain or functional homologue, fragment, or
derivative thereof for production of the DNA molecule as defined above.

Yet further provided is the use of the TolAIII domain or functional homologue, fragment, or
derivative thereof for production of an expression vector as defined above.

Also provided is the use of the TolAIII domain or functional homologue, fragment, or
derivative thereof for production of a cloning vector as defined above.

In one aspect there is provided a host cell containing the DNA as defined above and/or the 
expression vector as defined above and/or the cloning vector as defined above. 

In another aspect there is provided the use of the fusion polypeptide as defined above for
immobilisation of the non-TolA polypeptide, comprising the step of:

binding the fusion polypeptide to a TolA binding polypeptide (e.g. the TolA-recognition site of
colicin N [Gokce et al., 2000, supra] or other colicins, the TolA binding region of
bacteriophage g3p-D1 protein [Riechmann & Holliger, 1997, Cell 90: 351-360], or the TolA
binding region of TolB or other Tol proteins).

It is known that TolAIII interacts specifically with several naturally occurring proteins such as
colicins, phage proteins and other Tol proteins. This range of existing binding partners makes
the overexpression of TolAIII fusion proteins of particular utility since these proteins may be
used in purification or immobilisation technologies. The TolAIII domain therefore not only
drives high expression of the fusion polypeptide but also provides an affinity tag for
purification, immobilisation or analysis of the fusion polypeptide. The TolAIII binding
proteins (or binding polypeptide domains thereof) could be used to provide binding sites for
the TolAIII fusions (as in Figure 6). Protein chips could be made using these TolAIII binding
proteins, which then bind the TolAIII fusion proteins. This provides a way to immobilise a wide
variety of proteins on the surface using the TolAIII fusion as the common interaction.

Alternatively, the fusion polypeptide comprising an affinity tag as defined above may be used
for immobilisation of the non-TolA polypeptide, comprising the step of:

binding the affinity tag of the fusion polypeptide to a binding moiety.

Also provided is the use of the fusion polypeptide as defined above for purification and
isolation of the non-TolA polypeptide, comprising the steps of:

binding the fusion polypeptide to a TolA binding polypeptide (e.g. the TolA-recognition site of
colicin N or other colicins, the TolA binding region of bacteriophage g3p-D1 protein, or the
TolA binding region of TolB or other Tol proteins);

cleaving the non-TolA polypeptide from the TolAIII domain or functional homologue,
fragment, or derivative thereof using an endopeptidase; and

separating the cleaved non-TolA polypeptide from the TolAIII domain or functional
homologue, fragment, or derivative thereof.

In an alternative embodiment, the fusion polypeptide comprising an affinity tag may be used
for purification and isolation of the non-TolA polypeptide, comprising the steps of:

binding the affinity tag of the fusion polypeptide to a binding moiety;

cleaving the non-TolA polypeptide from the TolAIII domain or functional homologue,
fragment, or derivative thereof using an endopeptidase; and

separating the cleaved non-TolA polypeptide from the TolAIII domain or functional
homologue, fragment, or derivative thereof.

The fusion polypeptide as disclosed herein may be used for studying interaction properties of 
the non-TolA polypeptide or the fusion polypeptide, for example self-interaction, interaction 
with another molecule, or interaction with a physical stimulus. 

Also provided is a method for high expression of a polypeptide as a fusion polypeptide in a 
host cell, comprising the step of expressing the polypeptide as a fusion polypeptide as defined 
above in a host cell. Levels of expression of a polypeptide as a fusion protein defined herein 
will be high relative to levels of expression of a polypeptide not linked to the TolAIII domain.

The invention will be further described with reference to the accompanying figures. Of the
figures:

Figure 1: (Prior art) Shows the structure and sequence of the third domain of TolA. The
model is from the crystal structure of the complex between TolAIII and the N1 domain of minor
coat gene 3 protein from filamentous bacteriophage (Holliger et al., 1999, supra). The disulphide
bond is labelled black. Residues 333-421 were resolved in the model. Below is the sequence of the
TolAIII domain used in the present study (amino acids 324-421 of TolA protein; TolA protein
sequence from SwissProt Acc. No. P19934).

Figure 2: Shows pTol expression vectors. pTol vectors are T7-based expression vectors
derived from pET8c. The TolAIII with tags, depicted in the middle panel, is inserted
between the XhoI and MluI sites. A His6-Ser2 linker precedes the gene for domain III, coding
for amino acids 329-421 of TolA (middle panel). A short flexible part (Gly-Gly-Gly-Ser) then
follows, and then the cleavage site for endopeptidases, composed of four or five amino acids
(denoted by X in the middle panel and underlined in the bottom panel). The cleavage site is
denoted by an arrow. Stop codons are shown as asterisks.

Figure 3: Characterization of TolAIII expression. A: SDS-PAGE of TolAIII expressed
using three different vectors. Lane 1, pTolT uninduced; lane 2, pTolX; lane 3, pTolE;
lane 4, pTolT. B: Growth curve of bacteria with pTolT. Uninduced (solid squares) sample,
induced (open squares) sample. 1 mM IPTG was added to induce the sample at the time denoted
by an arrow. C: SDS-PAGE of fractionation of bacteria after expression of TolAIII from
pTolT. Lane 1, uninduced sample; lane 2, induced bacteria; lane 3, periplasmic fraction; lane 4,
cytoplasmic fraction; lane 5, insoluble (membrane + inclusion bodies) fraction. M, molecular
weight marker.

Figure 4: Expression of different proteins in E. coli using the pTol system. A: Expression of
fusions of TolAIII with prokaryotic proteins. Lane 1, colicin N 40-76; lane 2, Δ10 T-domain
colicin N; lane 3, R-domain colicin N. The bottom panel presents an estimation of the proportion
of expressed protein in bacterial cells as determined from scanned gels with Tina. Values
reported represent the average of estimations from 5-11 colonies ± SD. B: Expression of fusions
of TolAIII with eukaryotic proteins. Lane 1, PDK2; lane 2, NBD1 domain; lane 3, EqtII; lane 4,
PLA2. Values at the bottom are averages of estimations from 4-8 colonies ± SD. C: Expression of
fusions of TolAIII with membrane proteins. Lane 1, uninduced pTolT; lane 2, induced BcrC;
lane 3, induced TM1. The position where expressed BcrC and TM1 should appear on the gel
is denoted by an asterisk and circle, respectively. M, molecular weight marker; C, control of
bacterial cells from an uninduced sample of pTolT.

Figure 5: Purification of R-domain of colicin N. Lane 1, uninduced cells containing
pTolT-Rdomain vector; lane 2, induced cells; lane 3, bacterial cytoplasmic fraction; lane 4,
flowthrough of Ni-NTA chromatography; lane 5, purified fusion TolT-Rdomain proteins; lane
6, purified R domain after cleavage and ion-exchange chromatography.

Figure 6: Depicts diagrammatically various uses of a His-tagged fusion protein. A
TolAIII ("Tol") fusion partner with a His6 affinity tag is attached to a non-TolAIII polypeptide
(depicted as a circle). The non-TolAIII polypeptide may be removed from the fusion protein
by endopeptidase cleavage (depicted as a lightning bolt) and purified. The fusion protein may
be immobilised to a Nickel Chelate substrate via the His6 tag or (as shown) using an
immobilised tag made from all or part of a recognised TolAIII binding protein from bacteria
or phage, allowing the non-TolAIII polypeptide (or the entire fusion) to be available for
interaction studies.

EXPERIMENTAL

In our laboratory we first prepared fusion proteins between domain III of periplasmic TolA
protein (TolAIII) and the T domain of colicin N. Huge amounts of fusion protein were isolated
when TolAIII was at the N-terminus and the T-domain at the C-terminus. On the other hand,
when colicin N was the N-terminal partner no expression of fusion protein was obtained.
Although having TolAIII as an N-terminal partner was preferred in the fusion protein with the
T-domain of colicin N, it is possible that TolAIII could be useful as either an N- or C-terminal
partner in a fusion protein with other proteins.

Here we describe cloning of pTol vectors that use TolAIII as a fusion partner at the N-terminal
part of the expressed fusion protein. We show that levels of expression of various fusion proteins
are around 20 % of total bacterial proteins and we were able to purify 50-90 mg of fusions per
litre of bacterial broth. We prepared different components of colicin N by the use of this system.

Several proteins have been expressed using the system. These were different parts and
domains of colicin N (TolA binding box (peptide of amino acids 40-76), deletion mutant of
T-domain (Δ10) and R domain), representing prokaryotic proteins. Human phospholipase A2,
the pore-forming protein from sea anemone equinatoxin II, nucleotide binding domain 1 (NBD1)
of human cystic fibrosis transmembrane conductance regulator (CFTR) and human
mitochondrial pyruvate dehydrogenase kinase 2 (PDK2) were examples of eukaryotic
proteins. Transmembrane proteins were represented by BcrC, a component of the bacitracin
resistance system from B. licheniformis, and transmembrane domain 1 (TM1) of human
CFTR.

In all cases except for the two membrane proteins the yields of fusion protein were higher than
those of the individual proteins. The expression of small peptides and soluble proteins was
consistently good. More difficult targets were also chosen. The membrane proteins did not
express at all. The human PLA2, PDK2 and equinatoxin expressed well but, as in the case of the
individual proteins, much ends up in the insoluble fraction. PLA2 has many SS bonds and PDK2
has consistently resisted soluble expression in other systems. The TolAIII was not able to
overcome the insoluble behaviour of these fusion partners but their recovery from inclusion
bodies is still possible.

MATERIALS AND METHODS 

Cloning of pTol vectors:

The original vector used in cloning was a derivative of pET3c (Novagen) termed pET8c. The
pET8c vector was constructed by adding to the pET3c vector nucleotides encoding methionine
followed by six histidine and two serine residues downstream of the cloning site (Politou, A.S.
et al., 1994, Biochemistry 33(15): 4730-4737). The pET8c vector was used for expression
of a fusion between domain III of TolA protein (amino acids 329-421; SwissProt Acc. No. P19934)
and the T domain of colicin N. It is a T7-based expression vector with the bla gene, providing
ampicillin selection. The fusion protein contains a methionine followed by six histidines and
two serines at the N-terminal part. This linker enables easy purification using Ni-chelate
affinity chromatography. The fusion partners were linked together via a BamHI site. The
C-terminal end of the fusion was cloned via an MluI site. The T-domain gene was removed from the
vector by restricting it with BamHI and MluI. An adaptor sequence was then ligated into the
vector. It was composed in such a way that it removed the BamHI site within the flexible
linker, but introduced a new BamHI site just after the cleavage sequence for endopeptidases
(Figure 2). In this way fused partners can be cloned in a pTol vector via the BamHI or KpnI site,
leaving a tag of two (Gly-Ser) or four (Gly-Ser-Gly-Thr) amino acids, respectively, at the
N-terminus (see Figure 2). The linker between TolAIII and the fused partner is, therefore, composed
of a flexible part (Gly-Gly-Gly-Ser) and a cleavage sequence for endopeptidases (enterokinase,
factor Xa or thrombin) (Figure 2). The oligonucleotides (all oligonucleotides were from MWG
Biotech) with the following sequences were used as adaptors:
E(+) (5'-GATCTGATGATGACGATAAAGGATCCGGTACCTGATGAA-3') and
E(-) (5'-CGCGTTCATCAGGTACCGGATCCTTTATCGTCATCATCA-3') for enterokinase,
X(+) (5'-GATCTATTGAAGGTCGCGGATCCGGTACCTGATGAA-3') and
X(-) (5'-CGCGTTCATCAGGTACCGGATCCGCGACCTTCAATA-3') for factor Xa, and
T(+) (5'-GATCTCTGGTTCCGCGCGGATCCGGTACCTGATGAA-3') and
T(-) (5'-CGCGTTCATCAGGTACCGGATCCGCGCGGAACCAGA-3') for thrombin cleavage sites.
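
The adaptor pairs listed above can be checked computationally. The following Python sketch tests that each (+)/(-) pair is complementary apart from 5'-GATC and 5'-CGCG overhangs, and translates the (+) strand to show the encoded cleavage sequence. The frame offset of five bases is an assumption made for this illustration (that codons begin at the sixth base of the (+) strand); the vector names are those used for the resulting constructs in this document.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {"".join(c): aa for c, aa in zip(product("TCAG", repeat=3), AMINO)}

# Adaptor oligonucleotides as listed in the text (5'->3'): (plus, minus).
ADAPTORS = {
    "pTolE": ("GATCTGATGATGACGATAAAGGATCCGGTACCTGATGAA",
              "CGCGTTCATCAGGTACCGGATCCTTTATCGTCATCATCA"),
    "pTolX": ("GATCTATTGAAGGTCGCGGATCCGGTACCTGATGAA",
              "CGCGTTCATCAGGTACCGGATCCGCGACCTTCAATA"),
    "pTolT": ("GATCTCTGGTTCCGCGCGGATCCGGTACCTGATGAA",
              "CGCGTTCATCAGGTACCGGATCCGCGCGGAACCAGA"),
}

FRAME_OFFSET = 5  # assumption: codons start at the sixth base of the (+) strand


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def anneals(plus: str, minus: str) -> bool:
    """True if the strands pair, leaving 5'-GATC and 5'-CGCG overhangs."""
    return (plus.startswith("GATC") and minus.startswith("CGCG")
            and plus[4:] == revcomp(minus)[:-4])


def translate(seq: str) -> str:
    """Translate complete codons only; '*' marks a stop codon."""
    return "".join(CODONS[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


for vector, (plus, minus) in ADAPTORS.items():
    print(vector, anneals(plus, minus), translate(plus[FRAME_OFFSET:]))
```

Under that frame assumption the three (+) strands translate to DDDDKGSGT**, IEGRGSGT** and LVPRGSGT**, which agrees with the Gly-Ser and Gly-Ser-Gly-Thr tags described for the BamHI and KpnI sites.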

Newly cloned vectors were named pTolE, pTolX and pTolT; they comprise cleavage
sequences for enterokinase, factor Xa and thrombin, respectively. Fusion partners used to test
the system were cloned into the pTol vectors via the BamHI and MluI sites. If the nucleic acid
sequence coding for a particular protein contained an internal BamHI site, a KpnI site was used
instead. Nine different proteins were used to test the system (Table 1). Coding sequences were
amplified by PCR. The reaction mixture contained (in 100 µl total volume): 10 µl of 10×
reaction buffer supplied by the producer, 2 µl of 100 mM MgSO4, 4 µl of dNTP mix (200 µM
final concentration), 100 pmol of each oligonucleotide, approximately 20 ng of target DNA
and 1 Unit of Vent DNA polymerase (New England BioLabs). Typically the following cycles
were used: 10 min at 97°C; 30 cycles, each composed of 2 min denaturation at 97°C, 1 min of
annealing at 58°C, 1 min of extension at 72°C; 7 min at 72°C and soak at 10°C. PCR
fragments were purified using commercial kits (Qiagen) and restricted with the appropriate
restriction endonucleases. Restricted fragments were cloned into pre-cleaved pTol vector. The
correct nucleotide sequence of the fusion protein was verified by sequencing.
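
The arithmetic behind the PCR set-up can be sketched as follows. This is a minimal Python illustration of the dilution relation C1·V1 = C2·V2 applied to the stated volumes, plus the programmed hold time of the cycling scheme. The function and variable names are illustrative only, and the timing ignores instrument ramping and the final soak.

```python
TOTAL_UL = 100.0  # total reaction volume stated in the text


def final_conc(stock: float, volume_ul: float, total_ul: float = TOTAL_UL) -> float:
    """Final concentration after dilution (same unit as the stock)."""
    return stock * volume_ul / total_ul


mgso4_mM = final_conc(100.0, 2.0)          # 2 µl of 100 mM MgSO4
buffer_x = final_conc(10.0, 10.0)          # 10 µl of 10x buffer
dntp_stock_uM = 200.0 * TOTAL_UL / 4.0     # stock implied by 4 µl giving 200 µM
primer_uM = 100.0 / TOTAL_UL               # 100 pmol in 100 µl; pmol/µl equals µM

# Programmed hold times in minutes: initial step, 30 three-step cycles, final extension.
initial_min, cycles, final_min = 10, 30, 7
denature_min, anneal_min, extend_min = 2, 1, 1
hold_time_min = initial_min + cycles * (denature_min + anneal_min + extend_min) + final_min

print(f"MgSO4: {mgso4_mM} mM, buffer: {buffer_x}x, primers: {primer_uM} µM each")
print(f"implied dNTP stock: {dntp_stock_uM / 1000} mM, hold time: {hold_time_min} min")
```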



Table 1: Proteins used to test the pTol fusion expression system

Protein                     Amino acids /             Mw (Da)*  Plasmid       Cloning site  Oligos for PCR
                            SwissProt Acc. No.

Colicin N 40-76             40-76 / P08083            16038     pTolE, T, X   BamHI         1, 2
Colicin N Δ10 T-domain      11-90 / P08083            18567     pTolT         BamHI         3, 4
Colicin N R domain          67-183 / P08083           24667     pTolT         BamHI         5, 6
Human PLA2                  21-144 / NP_000291.1**    25810     pTolT         KpnI          7, 8
Equinatoxin II              36-214 / P17723           31575     pTolE         BamHI         9, 10
NBD1 domain of human CFTR   460-650 / P13659          33134     pTolT         BamHI         11, 12
Human PDK2                  18-407 / Q15119           56193     pTolT         KpnI          13, 14
BcrC                        2-203 / P42334            34775     pTolT         BamHI         15, 16
TM1 domain of human CFTR    2-355 / P13659            52590     pTolT         BamHI         17, 18

* Mw of fusion protein calculated from the sequence.
Cloning site: restriction site used for cloning at the N-terminal part of the fusion protein.
In all cases the C-terminal site used was MluI.
** RefSeq accession number.



Oligonucleotides to amplify the desired proteins were of the following sequences (all 5'-3'; 
see also Table 1): 

1. TTTTTGGATCCAATTCCAATGGATGGTCATGGAG

2. AAGGATCCAAGCTTCAAGGTTTAGGCTTTGAATTATTGTCC

3. TTTTTGGATCCAATGCTTTTGGTGGAGGGAAAAATC

4. CTCAGCGGTGGCAGCAGCC 

5. CGCGGATCCCATGGGGACAATAATTCAAAGC 

6. GGCGAATTCACGCGTTAAAATAATAATTTCTGGCTCAC

7. CCGGGGTACCAATTTGGTGAATTTCCACAGAATGATC 

8. GGCGAATTCACGCGTTAGCAACGAGGGGTGCTCCC 

9. CGCGGATCCGCAGACGTGGCTGGCGCC 

10. GGCGAATTCACGCGTTAAGCTTTGCTCACGTGAGTTTC 

11. CGCGGATCCTCTAATGGTGATGACAGCCTC 

12. GGCGAATTCACGCGTTAGAAAGAATCACATCCCATGAG 

13. CCGGGGTACCAAGTACATAGAGCACTTCAGCAAGTTC 

14. GGCGAATTCACGCGTTACGTGACGCGGTACGTGGTCG 

15. CGCGGATCCTTTTCAGAATTAAATATTGATG 

16. GGCGAATTCACGCGTTAAAAGTTCTTCGATTTATCG 

17. CGCGGATCCCAGAGGTCGCCTCTGG 

18. GGCGAATTCACGCGTTAGGGAAATTGCCGAGTGAC 

Expression of proteins in E. coli 

All proteins were expressed in an E. coli BL21(DE3)pLysE strain (from Novagen). The strain 
was transformed with plasmid and grown on LB plates with appropriate selection (ampicillin, 
chloramphenicol). One colony was used to inoculate 5 ml of LBAC medium (ampicillin at 
100 µg/ml, chloramphenicol at 34 µg/ml, both from Sigma). Bacteria were grown on a 
rotating wheel at 37°C. After 60 min the expression of recombinant proteins was induced by 
the addition of 1 mM (final) IPTG and bacteria were grown for an additional 4 h. Small samples 
(corresponding to a volume of bacteria which, when resuspended in 1 ml, yields A600 = 0.5) were 
analysed on SDS-PAGE. Gels were stained with Coomassie and scanned at 600 dpi using a 
commercial scanner. The amount of expressed proteins was estimated from the gels using the 
program Tina 2.0. For large-scale expression, 5 ml of bacterial culture in stationary phase was 
used to inoculate 250 ml of LBAC medium and grown at 37°C in an orbital shaker at 180 rpm 
overnight. The next morning 20-25 ml of overnight culture was used to inoculate 500 ml of 
M9 LBAC medium. In total 3-5 l of bacterial culture were grown for a single protein. Bacteria 
were grown under the same conditions until A600 reached approximately 0.8. Then the production 
of recombinant proteins was induced by adding IPTG to a final concentration of 1 mM. Bacteria 
were grown for an additional 4-5 h, centrifuged for 5 min at 5000 rpm at 4°C, and stored at 
-20°C. 
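
The normalised gel sample described above (cells that give A600 = 0.5 when resuspended in 1 ml) amounts to a simple proportion. The following is a minimal sketch, assuming absorbance scales linearly with cell density over the measured range; the function name and defaults are illustrative.

```python
def sample_volume_ml(culture_a600, target_a600=0.5, resuspension_ml=1.0):
    """Volume of culture (ml) whose cells give target_a600 when resuspended
    in resuspension_ml, assuming A600 is proportional to cell density."""
    if culture_a600 <= 0:
        raise ValueError("culture A600 must be positive")
    return target_a600 * resuspension_ml / culture_a600


if __name__ == "__main__":
    # A culture at A600 = 2.0 needs 0.25 ml to give the normalised sample.
    print(sample_volume_ml(2.0))
```

Loading the same cell equivalent in every lane is what allows the band intensities of different cultures to be compared on one gel.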

Isolation of proteins from bacteria 

Pelleted bacteria were resuspended (2 ml of buffer / g of cells) in 50 mM NaH2PO4, pH 8.0, 
300 mM NaCl, 10 mM imidazole, 20 mM β-mercaptoethanol (buffer A), with the following 
enzymes and inhibitors of proteases (final concentrations): DNase (10 µg/ml), RNase (20 
µg/ml), lysozyme (1 mg/ml of buffer), PMSF (0.5 mM), benzamidine (1 mM). They were 
incubated on ice for an hour and occasionally vigorously shaken. The resuspended bacteria 
were sonicated for 3 min with a Branson sonicator and then centrifuged in a Beckman 
ultracentrifuge at 40000 rpm, 4°C, in a 45Ti rotor. Supernatant was removed and placed at 4°C. Pellet 
was resuspended in the same buffer without enzymes and inhibitors (1 ml / g of weight) and 
kept on ice for 15 min. Centrifugation under the same conditions followed after an additional 1 min 
of sonication. Supernatants from both centrifugations were merged and applied at 
approximately 1 ml/min to 1-3 ml of Ni-NTA resin (Qiagen) equilibrated with buffer A. 
Typically, the column with bound protein was washed with two fractions of 3 ml of buffer A, two 
fractions of buffer A with 20 mM imidazole and 6-10 fractions of buffer A with 300 mM 
imidazole. Fractions were analysed on SDS-PAGE. Fractions of interest were pooled and 
dialysed three times against water (5 l) at 4°C. Purity was checked by SDS-PAGE. Proteins 
were stored at 4°C in 3 mM NaN3. Protein concentration was determined by using extinction 
coefficients calculated from the sequence. 
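
The text does not state how the extinction coefficients were calculated. One widely used approach sums per-residue contributions at 280 nm (5500 per Trp, 1490 per Tyr and 125 per disulphide, in M^-1 cm^-1) and then applies the Beer-Lambert law. The sketch below assumes that approach; the function names and the example values are hypothetical.

```python
def molar_extinction_280(sequence, n_disulphides=0):
    """Estimate the molar extinction coefficient at 280 nm (M^-1 cm^-1)
    from Trp and Tyr counts plus the number of disulphide bonds."""
    sequence = sequence.upper()
    return 5500 * sequence.count("W") + 1490 * sequence.count("Y") + 125 * n_disulphides


def concentration_mg_per_ml(a280, epsilon, mw_da, path_cm=1.0):
    """Beer-Lambert: molar concentration = A / (epsilon * path);
    mol/l multiplied by g/mol gives g/l, which equals mg/ml."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive (no Trp, Tyr or disulphides?)")
    return a280 / (epsilon * path_cm) * mw_da


if __name__ == "__main__":
    # Hypothetical peptide with one Trp, two Tyr and one disulphide bond.
    eps = molar_extinction_280("ACWYKYC", n_disulphides=1)
    print(eps, concentration_mg_per_ml(0.5, eps, mw_da=10000.0))
```

The estimate is only meaningful for sequences containing at least one Trp, Tyr or disulphide bond, which is why the helper rejects a zero coefficient.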

Fractionation of bacterial proteins 

All bacterial proteins were fractionated in order to see the amount of insoluble expressed 
proteins. Pelleted bacteria from 100 ml of broth were resuspended in 40 ml of 20 % sucrose, 1 
mM EDTA, 30 mM Tris-HCl, pH 8.0 and incubated for 10 min at room temperature. They were 
centrifuged at 9000 g for 10 min at 4°C. Supernatant was removed and the pellet was gently 
resuspended in 8 ml of ice-cold 5 mM MgSO4. Bacteria were gently shaken and incubated on 
ice for 10 min. Bacterial protoplasts were centrifuged again under the same conditions. 
Supernatant was removed as the periplasmic fraction. Pellet was resuspended in 10 ml of 20 mM 
NaH2PO4, pH 8.0, with 1 mg of lysozyme and benzamidine. It was shaken vigorously and 
incubated on ice for 30 min, and finally sonicated 5 x 30 s. Cytoplasmic proteins were 
separated from insoluble material by centrifugation at 35 000 g at 4°C for 30 min. Supernatant 
was removed as the cytoplasmic fraction and the pellet was resuspended in 2 ml of 8 M urea, 10 mM 
Tris-HCl, pH 7.4, 0.5 % Triton X-100 as the insoluble fraction (membrane proteins and putative 
inclusion bodies). 

Cleavage and purification of TolAIII-R-domain colicin N fusion 

Pure R-domain of colicin N was produced using the pTol expression system. 45 mg of 
TolAIII-R-domain was incubated in 35 ml of cleavage mixture at 20°C for 20 h. The cleavage 
mixture contained buffer as specified by the producer and thrombin (Restriction grade, Novagen) at 
0.1 U/mg of fused protein. Cleaved products were dialysed three times against 5 l of 40 mM 
Tris-HCl, pH 8.4 at 4°C, each time for at least 4 h. Cleaved R domain was separated from TolAIII 
and uncleaved fusion protein by ion-exchange chromatography on an FPLC system (Pharmacia). 
Proteins were applied to a Mono S column (Pharmacia) at 1 ml/min in 40 mM Tris-HCl, pH 8.4. 
After unbound material was washed from the column, R-domain was eluted by applying a 
gradient of NaCl from 0 to 500 mM in the same buffer over 30 min. A large peak at approximately 
70% of the gradient (app. 350 mM NaCl) was collected and checked for purity by SDS-PAGE. 
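
The quantities in the cleavage and elution steps follow from simple arithmetic. The sketch below reproduces it, assuming a linear gradient; the function names are illustrative only.

```python
def thrombin_units(protein_mg, units_per_mg=0.1):
    """Total thrombin (U) needed at a fixed dose per mg of fusion protein."""
    return protein_mg * units_per_mg


def nacl_mM_at(fraction_of_gradient, start_mM=0.0, end_mM=500.0):
    """NaCl concentration at a given fraction (0..1) of a linear gradient."""
    if not 0.0 <= fraction_of_gradient <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    return start_mM + fraction_of_gradient * (end_mM - start_mM)


if __name__ == "__main__":
    print(thrombin_units(45))   # dose for 45 mg of fusion at 0.1 U/mg
    print(nacl_mM_at(0.70))     # salt concentration at 70 % of a 0-500 mM gradient
    print(45 / 35)              # protein concentration in the cleavage mixture, mg/ml
```

For 45 mg of fusion this gives about 4.5 U of thrombin, and 70 % of a 0-500 mM gradient corresponds to about 350 mM NaCl, in agreement with the value quoted in the text.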

RESULTS 

Expression of TolAIII protein in E. coli 

The third domain of TolA (TolAIII) with tags (Figure 2) was expressed from three different expression 
vectors (Figure 3), pTolE, pTolT, and pTolX. In each case, the expression of TolAIII was 
very high, sometimes reaching up to 40 % of all bacterial proteins (see Figure 3A). Specifically, 
the amount of expressed TolAIII from pTolT was 26.96 % ± 1.67 (n=5). The amount of 
expressed TolAIII was approximately the same regardless of which vector was used. TolA 
expressed in bacteria did not interfere with normal bacterial metabolism. The growth curve 
was very similar for induced and non-induced bacteria (Figure 3B). All of the TolAIII protein 
was expressed in soluble form. No inclusion bodies were revealed by visual inspection of 
pelleted remains of bacteria after osmotic lysis, lysozyme treatment, sonication, and 
centrifugation. Furthermore, none of the TolAIII was found in the insoluble cell fraction after 
fractionation of proteins from bacteria. The insoluble fraction represents membrane proteins and 
should also contain recombinant proteins in inclusion bodies (Figure 3C). Bacteria containing 
TolAIII were slightly more fragile than normal. TolAIII was released from the cells already after 
mild hypo-osmotic treatment, which should release only periplasmic proteins. 

Expression of other proteins in E. coli as fusions with TolAIII 

Nine proteins were tested in order to check the suitability of the pTol expression system for 
expression and preparation of other proteins (Table 1). These were different parts and domains 
of colicin N (TolA binding box (peptide of amino acids 40-76), deletion mutant of T-domain 
(Δ10) and R domain), representing prokaryotic proteins. Human phospholipase A2, the pore- 
forming protein from sea anemone equinatoxin II, nucleotide binding domain 1 (NBD1) of 
human cystic fibrosis transmembrane conductance regulator (CFTR) and human 
mitochondrial pyruvate dehydrogenase kinase 2 (PDK2) were examples of eukaryotic 
proteins. Transmembrane proteins were represented by BcrC, a component of the bacitracin 
resistance system from B. licheniformis, and transmembrane domain 1 (TM1) of human 
CFTR. The proteins chosen represent variations in size (app. 4.4 kDa of colicin 40-76 vs. 44 kDa 
of PDK2), genetic code (prokaryotic vs. eukaryotic proteins), protein location (soluble vs. 
membrane), and disulphide content (PLA2, 7 disulphides vs. equinatoxin, none). Fusion 
proteins were expressed at a high proportion in E. coli using the pTol system (Figure 4). Again, the 
expression was as high as 40 % in some cases, but the average was around 20-25 % (see Figure 4B 
and C bottom panels). The only two exceptions were the membrane proteins, BcrC and TM1. In 
these cases a band corresponding to their size was lacking from the gel (Figure 4C). As opposed 
to expression of TolAIII alone, expression of fusion proteins interfered with the growth of 
bacteria. In the case of PLA2 and the membrane proteins, TM1 and BcrC, the amount of bacteria 
at the end of the growth halved in some cases. Interestingly, expression of the PDK2 fusion in 
the bacterial cell had a positive effect and there were always slightly more bacteria at the end of the 
growth (not shown). Some of the bacteria expressing fusions were further fractionated. PDK2 
and PLA2 were expressed as insoluble inclusion bodies. Eqtn and R-domain were found 
mainly in the insoluble fraction, but some proportion was also found in the cytoplasmic fraction 
(10-25 % of expressed proteins) (not shown). 

Isolation and cleavage of fusion proteins 

Expressed fusions were isolated from the cytoplasm by simple extraction into buffered 
solution, which was applied onto a Ni-NTA column. By this single step proteins were already 
more than 95 % pure (Figure 5). Yields of isolated fusions were on average approximately 50 
mg/l of bacterial broth, but reached up to 90 mg/l (Table 2). Even proteins which were mainly 
expressed as inclusion bodies were isolated in significant quantities by this procedure, e.g. 11 
mg/l of Eqtn fusion was isolated. One of the fusion proteins, TolE-Tdomain 40-76, was 
used for the preparation of a peptide sample suitable for structure determination by NMR. It 
was expressed in M9 minimal media containing 15NH4Cl. Even in minimal media it was 
possible to express and produce the fusion in significant amounts; almost 70 mg of pure fusion 
was obtained per litre of bacterial culture. 



Table 2: Yields of isolated fusion proteins by using pTol system 

Protein*                   Yield (mg/l bacterial broth) 
TolE-Tdomain 40-76         46.7 
15N TolE-Tdomain 40-76     67.1 
TolT-Tdomain 40-76         83.8 
TolX-Tdomain 40-76         89.6 
TolT-Δ10 Tdomain           37.4 
TolT-Rdomain               51 
TolE-Eqtn                  11 
TolT-PDK                   1.4 

* Proteins are named after plasmid used for expression of fusion protein. 
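
The statement that yields averaged approximately 50 mg/l and reached up to 90 mg/l can be checked directly against the values in Table 2. The sketch below does this; the protein names are normalised spellings of the table entries, which is an assumption about the garbled original labels.

```python
# Yields from Table 2, in mg per litre of bacterial broth.
YIELDS_MG_PER_L = {
    "TolE-Tdomain 40-76": 46.7,
    "15N TolE-Tdomain 40-76": 67.1,
    "TolT-Tdomain 40-76": 83.8,
    "TolX-Tdomain 40-76": 89.6,
    "TolT-delta10 Tdomain": 37.4,
    "TolT-Rdomain": 51.0,
    "TolE-Eqtn": 11.0,
    "TolT-PDK": 1.4,
}


def mean_yield(yields):
    """Arithmetic mean of the yields (mg/l)."""
    return sum(yields.values()) / len(yields)


def best_yield(yields):
    """Return (protein, yield) for the highest-yielding fusion."""
    name = max(yields, key=yields.get)
    return name, yields[name]


if __name__ == "__main__":
    print(round(mean_yield(YIELDS_MG_PER_L), 1))
    print(best_yield(YIELDS_MG_PER_L))
```

The mean of the eight entries is 48.5 mg/l and the maximum is 89.6 mg/l, consistent with the figures quoted in the text.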



Pure R-domain was prepared from the TolT-Rdomain fusion by cleavage with thrombin and 
separation of cleavage products by ion-exchange chromatography. The results of this 
purification scheme are presented in Figure 5. By the outlined procedure 13 mg of pure 
functional R domain was prepared from 1 l of starting bacterial culture. The slightly lower yield than 
expected from the amount of soluble fusion is a consequence of R-domain precipitation during 
the preparation. However, the yield presented here is still more than two times higher than that of the 
system which provides directly expressed R-domain. 

DISCUSSION 

TolAIII is expressed in very large quantities in soluble form in the bacterial cytoplasm. The 
reasons most commonly cited for high expression of proteins in E. coli are appropriate codon 
usage, stability of the mRNA transcript, size, content of disulphide bonds, and toxicity to the cell. 
TolAIII is a small protein, with only one disulphide bond. It is very stable and monomeric in 
solution even at concentrations as high as 30 mg/ml (data from analytical ultracentrifugation 
and gel filtration, not shown). The small size and tendency not to aggregate are certainly 
important in tolerance of heterologous material in the cytoplasm of bacteria. A further 
advantage of the TolAIII gene is that it encodes a bacterial protein and as such possesses only 5 codons 
(4.7 % of 106 amino acids excluding protease cleavage site) that are rarely used in the E. coli 
genome. They are scattered along the sequence. An improvement of its expression could be 
achieved by engineering the conformation of its mRNA transcript. It was shown that, for a 
high yield of transcribed RNA, the conformation of the RNA should in some cases be such that the 
ribosome binding site and start codon are exposed and not involved in base pairing. In 
the case of TolAIII mRNA both are involved in building short stems and are not always 
completely exposed (analysis of transcribed RNAs of 60-120 nucleotides (step of 10 nt) by 
Mfold on http://bioinfo.math.rpi.edu/~zukerm/). The high expression of TolAIII protein in the T7 
based vector and the high yields of pure product are comparable to or even better than published 
and existing systems for production of fusion proteins in E. coli. 
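
A rare-codon count of the kind quoted above (5 codons, 4.7 % of 106) can be obtained by scanning a coding sequence in frame. The sketch below is illustrative only: the set of codons treated as rare in E. coli is an assumption chosen for the example (the text does not list which codons were counted), and the test sequence is hypothetical rather than the TolAIII gene.

```python
# Illustrative set of codons often described as rare in E. coli.
# This set is an assumption for the example, not taken from the text.
RARE_CODONS = frozenset({"AGG", "AGA", "CGA", "CGG", "ATA", "CTA", "CCC"})


def rare_codon_stats(cds, rare=RARE_CODONS):
    """Return (rare_count, total_codons, percent_rare) for an in-frame
    coding sequence given as DNA or RNA."""
    cds = cds.upper().replace("U", "T")
    if not cds or len(cds) % 3 != 0:
        raise ValueError("sequence must be non-empty and a multiple of 3 nt long")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    n_rare = sum(codon in rare for codon in codons)
    return n_rare, len(codons), 100.0 * n_rare / len(codons)


if __name__ == "__main__":
    # Hypothetical five-codon sequence containing two codons from the set.
    print(rare_codon_stats("ATGAGGCTGAGATAA"))
    # The proportion quoted in the text: 5 rare codons out of 106.
    print(round(100 * 5 / 106, 1))
```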

We have employed a domain of a periplasmic bacterial protein as a fusion partner in the 
overexpression of various proteins of bacterial and eukaryotic origin. Some small peptides or 
domains could be attached to TolAIII without significantly changing its size. The same 
amount of expressed protein would then be expected. In fact, the yield of the fusion containing 
the colicin N 40-76 peptide was the same as for TolAIII itself. The system is suitable for the 
preparation of eukaryotic proteins as well. In particular, the level of expression of Eqtn is 
much improved over the published one: approximately 20 % of total expression for the 
fusion contrasted with approximately 5 % in the case of direct expression. The majority of 
Eqtn expressed from the pTol system is in the insoluble fraction, but isolation of the soluble 
cytoplasmic fraction still resulted in a large improvement in yield over the published method. 
The pTol system might also be applicable for proteins expressed as inclusion bodies. For 
example, the amount of expressed PLA2 is similar to other expression systems; however, the 
fusion protein can easily be isolated by Ni-NTA chromatography and then refolded and 
cleaved on the column matrix. An interesting observation was that the two membrane proteins 
studied did not express as fusion proteins with the pTol system, although the reason for this is 
unclear at the moment. 

Three expression vectors were constructed providing three different cleavage sites for 
endopeptidases widely used in molecular biology, namely enterokinase, factor Xa and thrombin. 
Recognition sites for the endopeptidases differ in amino acid sequence and size. These differences 
dramatically change the properties of the small TolAIII partner in fusion proteins (Table 3). 
TolAT and TolAX are basic, with a calculated pI of more than 8.5, whereas TolAE is acidic, with a pI of 6.6. 
This is the result of four aspartates in the recognition sequence for enterokinase (DDDDK). 
The constructed vectors thus enable higher flexibility, i.e. one can easily choose an appropriate 
vector on the basis of the properties of the fused partner. In our case, R-domain of colicin N was 
expressed in the pTolT vector since R-domain is even more basic (pI 9.7) than cleaved TolAIII. 
On the other hand, colicin N peptide 40-76 has almost the same pI as TolAT or TolAX. This 
would make subsequent purification much more difficult, since the peaks representing peptide and TolAIII 
would overlap in ion-exchange chromatography. Therefore, the peptide was expressed in 
pTolE. Cleaved TolAIII was not bound to the column under the chosen conditions and the difference 
in pI of the uncleaved fusion (pI 7.2) and peptide was large enough to get clearly resolved 
peaks (not shown). 
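
The effect of the recognition sequence on the charge of the cleaved TolAIII partner can be illustrated by counting charged side chains. The sketch below is deliberately crude: it assumes that each protease cuts at the C-terminal end of its recognition sequence, so the whole sequence stays on TolAIII, and it counts Asp and Glu as -1 and Lys and Arg as +1 at neutral pH, ignoring His and the termini. It is not a pI calculation.

```python
# Formal side-chain charges at neutral pH (His and termini ignored).
SIDE_CHAIN_CHARGE = {"D": -1, "E": -1, "K": 1, "R": 1}

# Recognition sequence assumed to remain on TolAIII after cleavage.
RETAINED_SITE = {
    "TolAE": "DDDDK",  # enterokinase
    "TolAT": "LVPR",   # thrombin
    "TolAX": "IEGR",   # factor Xa
}


def net_charge(peptide):
    """Sum of formal side-chain charges of a peptide at neutral pH."""
    return sum(SIDE_CHAIN_CHARGE.get(residue, 0) for residue in peptide.upper())


if __name__ == "__main__":
    for protein, site in RETAINED_SITE.items():
        print(protein, site, net_charge(site))
```

The contributions are -3 for DDDDK, +1 for LVPR and 0 for IEGR, which follows the same order as the calculated pI values in Table 3 (TolAE 6.57, TolAX 8.57, TolAT 8.93).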



Table 3: Physical properties of TolAIII proteins after endoproteinase cleavage 

Protein*     Amino acids     Mw**        pI** 
TolAE        111             11716.1     6.57 
TolAT        110             11593.2     8.93 
TolAX        110             11583.1     8.57 

* Proteins are named according to the vector in which they were produced. 
** Calculated from the sequence. 

We could produce functional parts of the colicin N toxin by using the pTol expression system. 
We produced functional R-domain and a 39 residue peptide composed of colicin residues 40-76. 
His-tagged R-domain expresses poorly and irreproducibly, whereas the TolA fusion expressed 
consistently well and improved the yield more than twofold. The peptide was produced as a 
labelled sample for NMR structure determination. Preparation of large quantities of labelled 
peptide sample for NMR structure analysis can be problematic and a significant financial 
burden to research groups. The high yields and versatility of the pTol system should make 
preparation of short peptides and proteins much cheaper and provide an alternative to chemical synthesis 
and other expression systems. The system may be particularly useful for reproducible high 
level expression of small (<20 kDa) soluble proteins and unstructured peptides. For example, 
the system might prove useful in the preparation of isotopically labelled small peptides for 
NMR structural studies. 

CLAIMS 

1. A fusion polypeptide for expression in a host cell comprising a TolAIII domain or a 
functional homologue, fragment, or derivative thereof and a non-TolA polypeptide excluding 
a polypeptide consisting of an amino acid sequence corresponding to residues 1-86 of mature 
phage minor coat gene 3 protein g3p. 

2. A fusion polypeptide for expression in a host cell comprising a TolAIII domain or a 
functional homologue, fragment, or derivative thereof and a non-TolA polypeptide, wherein 
the TolAIII domain or functional homologue, fragment, or derivative thereof is located 
towards the N-terminus of the fusion polypeptide and the non-TolA polypeptide is located 
towards the C-terminus of the fusion polypeptide. 

3. The fusion protein according to claim 1, wherein the TolAIII domain or functional 
homologue, fragment, or derivative thereof is located towards the C-terminus of the fusion 
polypeptide and the non-TolA polypeptide is located towards the N-terminus of the fusion 
polypeptide. 

4. The fusion polypeptide according to any one of the preceding claims, wherein the 
TolAIII domain or functional homologue, fragment, or derivative thereof has been 
codon-optimised for expression in the host cell. 

5. The fusion polypeptide according to any one of the preceding claims, further 
comprising a linker between the TolAIII domain or functional homologue, fragment, or 
derivative thereof and the non-TolA polypeptide. 

6. The fusion polypeptide according to claim 5, wherein the linker comprises at least one 
cleavage site for an endopeptidase. 

7. The fusion polypeptide according to claim 6, wherein the cleavage site comprises the 
amino acid sequence DDDDK and/or LVPR and/or IEGR. 

8. The fusion polypeptide according to any one of the preceding claims, further 
comprising an affinity purification tag. 

9. The fusion polypeptide according to claim 8, wherein the affinity purification tag is 
located at the N-terminus of the fusion polypeptide. 

10. The fusion polypeptide according to claim 9, wherein the affinity purification tag is an 
N-terminal His_n tag, with n=4, 5, 6, 7, 8, 9 or 10 (preferably 6), optionally with the His_n tag 
linked to the fusion polypeptide by one or more Ser residues (preferably 2). 

11. The fusion polypeptide according to any one of the preceding claims, wherein the 
TolAIII domain consists of amino acid residues 329-421 of Escherichia coli TolA sequence 
(SwissProt Acc. No. P19934). 

12. The fusion polypeptide according to any one of the preceding claims, wherein the host 
cell is bacterial (for example, Escherichia coli). 

13. A DNA molecule encoding the fusion polypeptide as defined in any one of claims 1-12. 

14. A DNA molecule according to claim 13, wherein the mRNA properties of the DNA 
molecule when transcribed are optimised for expression in the host cell. 

15. An expression vector comprising the DNA molecule according to either one of claim 
13 or claim 14 for expression of the fusion polypeptide defined in any one of claims 1-12. 

16. The expression vector according to claim 15, having an inducible promoter (for 
example, the IPTG-inducible T7 promoter) which drives expression of the fusion polypeptide. 

17. The expression vector according to either one of claim 15 or claim 16, having an 
antibiotic resistance marker (for example, the bla gene, which confers resistance to ampicillin 
and chloramphenicol). 

18. A cloning vector for producing the expression vector defined in any one of claims 15- 
17, comprising DNA encoding the TolAIII domain or a functional homologue, fragment, or 
derivative thereof upstream or downstream from a cloning site which allows in-frame 
insertion of DNA encoding a non-TolA polypeptide. 

19. The cloning vector according to claim 18, further comprising DNA encoding at least 
one cleavage site (for example, the amino acid sequence DDDDK and/or LVPR and/or IEGR) 
for an endopeptidase, the cleavage site located between the DNA encoding the TolAIII 
domain or a functional homologue, fragment, or derivative thereof and the cloning site. 

20. The cloning vector according to either one of claims 18 or 19, wherein the cloning site 
comprises at least one restriction endonuclease (for example, BamHI and/or KpnI) target 
sequence. 

21. The cloning vector according to any one of claims 18-20, further comprising DNA 
encoding an affinity purification tag as defined in either one of claim 8 or claim 9. 

22. The cloning vector according to any one of claims 18-21, further comprising an 
inducible promoter (for example, the IPTG-inducible T7 promoter). 

23. The cloning vector according to any one of claims 18-22, further comprising DNA 
encoding an antibiotic resistance marker (for example, the bla gene, which confers resistance 
to ampicillin and chloramphenicol). 

24. The cloning vector according to any one of claims 18-23, having the structure of 
pTolE, pTolT or pTolX (as shown in Figure 2 with reference to the description). 

25. Use of the TolAIII domain or functional homologue, fragment, or derivative thereof 
for production of a fusion polypeptide as defined in any one of claims 1-12. 

26. Use of the TolAIII domain or functional homologue, fragment, or derivative thereof 
for production of the DNA molecule as defined in either one of claim 13 or claim 14. 

27. Use of the TolAIII domain or functional homologue, fragment, or derivative thereof 
for production of an expression vector as defined in any one of claims 15-17. 

28. Use of the TolAIII domain or functional homologue, fragment, or derivative thereof 
for production of a cloning vector as defined in any one of claims 18-24. 

29. A host cell containing the DNA as defined in claim 12 and/or the expression vector as 
defined in any one of claims 15-17 and/or the cloning vector as defined in any one of claims 
18-24. 

30. Use of the fusion polypeptide as defined in any one of claims 5-12 for immobilisation 
of the non-TolA polypeptide, comprising the step of: 

binding the fusion polypeptide to a TolA binding polypeptide (e.g. the TolA-recognition site of 
colicin N or other colicins, the TolA binding region of bacteriophage g3p-D1 protein, or the 
TolA binding region of TolB or other Tol proteins). 

31. Use of the fusion polypeptide as defined in any one of claims 8-12 for immobilisation 
of the non-TolA polypeptide, comprising the step of: 

binding the affinity tag of the fusion polypeptide to a binding moiety. 

32. Use of the fusion polypeptide as defined in any one of claims 5-12 for purification and 
isolation of the non-TolA polypeptide, comprising the steps of: 

binding the fusion polypeptide to a TolA binding polypeptide (e.g. the TolA-recognition site of 
colicin N or other colicins, the TolA binding region of bacteriophage g3p-D1 protein, or the 
TolA binding region of TolB or other Tol proteins); 

cleaving the non-TolA polypeptide from the TolAIII domain or functional homologue, 
fragment, or derivative thereof using an endopeptidase; and 

separating the cleaved non-TolA polypeptide from the TolAIII domain or functional 
homologue, fragment, or derivative thereof. 

33. Use of the fusion polypeptide as defined in any one of claims 8-12 for purification and 
isolation of the non-TolA polypeptide, comprising the steps of: 

binding the affinity tag of the fusion polypeptide to a binding moiety; 

cleaving the non-TolA polypeptide from the TolAIII domain or functional homologue, 
fragment, or derivative thereof using an endopeptidase; and 

separating the cleaved non-TolA polypeptide from the TolAIII domain or functional 
homologue, fragment, or derivative thereof. 

34. Use of the fusion polypeptide as defined in any one of claims 1-12 for studying 
interaction properties of the non-TolA polypeptide or the fusion polypeptide, for example 
self-interaction, interaction with another molecule, or interaction with a physical stimulus. 

35. A method for high expression of a polypeptide as a fusion polypeptide in a host cell, 
comprising the step of expressing the polypeptide as a fusion polypeptide as defined in any 
one of claims 1-12 in a host cell. 

GSGNTKNN GASGADINNY AGQIKSAIES KFYDASSYAG KTCTLRIKLA EGGDPALCQA 
ALAAAKLAKI PKPPSQAVYE VFKNAPLDFK P 



Figure 1 (Prior Art) 

                                           BamHI   KpnI            MluI
pTolE  ggt ggg gga tct gat gat gac gat aaa gga tcc ggt acc tga tga acg cgt
        G   G   G   S   D   D   D   D   K   G   S   G   T   *   *   T   R

                                       BamHI   KpnI            MluI
pTolT  ggt ggg gga tct ctg gtt ccg cgc gga tcc ggt acc tga tga acg cgt
        G   G   G   S   L   V   P   R   G   S   G   T   *   *   T   R

                                       BamHI   KpnI            MluI
pTolX  ggt ggg gga tct att gaa ggt cgc gga tcc ggt acc tga tga acg cgt
        G   G   G   S   I   E   G   R   G   S   G   T   *   *   T   R

Figure 2 

Figure 5 